ELITE Portal Tutorial: Joining Metadata

Author

Melissa Klein (Sage Bionetworks)

Published

March 12, 2025

Install synapser if you have not already

install.packages("synapser", repos = c("http://ran.synapse.org", "https://cloud.r-project.org"))
install.packages(c("tidyverse", "lubridate"))

Load libraries

library(synapser)
library(readr)
library(dplyr)
library(magrittr)

Log in to Synapse

synLogin()

Download and read in metadata files

Three metadata files describe the data on the ELITE Portal: Individual, Biospecimen, and Assay. You will often need all three, joined together, to understand the data you are looking at.

This example uses the three metadata files found on the Study Details page for the Mouse M005 Metabolomics Study.

The ELITE Portal generates the query below when you download programmatically.

# Download the results of the filtered table query
query <- synTableQuery("SELECT * FROM syn52234677 WHERE ( ( \"Study\" = 'Mouse_M005_Study_Metabolomics' ) ) AND ( `resourceType` = 'metadata' )")
Downloaded syn52234677 to /Users/mklein/.synapseCache/745/153908745/SYNAPSE_TABLE_QUERY_153908745.csv
read.table(query$filepath, sep = ",")
V1              V2          V3          V4
<chr>           <chr>       <chr>       <chr>
dataRestriction id          name        individualCount
open            syn61348404 syn61348404 0
open            syn64020472 syn64020472 0
open            syn64020473 syn64020473 0
# View the file path of the resulting csv
query$filepath
[1] "/Users/mklein/.synapseCache/745/153908745/SYNAPSE_TABLE_QUERY_153908745.csv"
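Note that read.table() without header = TRUE treats the header row as data, which is why the columns above are named V1 to V4. The following is a minimal sketch of an alternative: it reads the query CSV with its header and pulls out the Synapse IDs so that each file could then be fetched with synGet(). The helper names read_query_ids and download_metadata, the toy CSV, and the files destination folder are assumptions for illustration and are not part of the portal-generated code.

```r
# Hypothetical helper: read a table-query CSV and return its Synapse IDs.
read_query_ids <- function(path) {
  # header = TRUE keeps the real column names instead of V1, V2, ...
  result <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)
  stopifnot("id" %in% names(result))
  result$id
}

# Hypothetical helper: download each entity into `dest`.
# `fetch` defaults to synapser::synGet, so a prior synLogin() is assumed.
download_metadata <- function(ids, dest = "files", fetch = synapser::synGet) {
  dir.create(dest, showWarnings = FALSE, recursive = TRUE)
  lapply(ids, function(id) fetch(id, downloadLocation = dest))
}

# Toy stand-in for query$filepath so the sketch runs without a Synapse login.
# The IDs and file names below are made up.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "dataRestriction,id,name,individualCount",
  "open,syn00000001,example_individual.csv,0",
  "open,syn00000002,example_biospecimen.csv,0"
), toy_path)

ids <- read_query_ids(toy_path)
print(ids)
```

In a logged-in session, download_metadata(read_query_ids(query$filepath)) would be the corresponding call. The downloaded file names come from Synapse and may differ from the paths used in the rest of this tutorial.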

With the three metadata files saved locally (here, in a files/ directory), we can read them into R.

# Individual metadata
individual_metadata <- read_csv("files/individual_non_human_M005_Longevity Consortium_11-11-2024_final.csv", show_col_types = FALSE)

# Biospecimen metadata
biospecimen_metadata <- read_csv("files/biospecimen_non_human_M005_Longevity Consortium_11-11-2024_final.csv", show_col_types = FALSE)

# Assay metadata
assay_metadata <- read_csv("files/synapse_storage_manifest_assaymetabolomicstemplate.csv", show_col_types = FALSE)
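The join in the next section relies on specimenID linking the assay and biospecimen files, and on individualID linking the biospecimen and individual files. A quick check before joining can catch missing key columns, unmatched keys, or duplicate keys that would multiply rows. The sketch below uses base R only; check_key is a hypothetical helper, and the toy data frames are made up for illustration.

```r
# Hypothetical helper: summarize how well `key` links table x to table y.
check_key <- function(x, y, key) {
  stopifnot(key %in% names(x), key %in% names(y))
  list(
    # keys in x with no partner in y (these rows get NA after a left join)
    unmatched = setdiff(unique(x[[key]]), y[[key]]),
    # keys repeated in y (each repeat duplicates the matching x rows)
    duplicated_in_y = unique(y[[key]][duplicated(y[[key]])])
  )
}

# Toy data for illustration only
assay_toy <- data.frame(specimenID = c("S1", "S2", "S3"))
biospecimen_toy <- data.frame(
  specimenID = c("S1", "S2", "S2"),
  individualID = c("I1", "I2", "I2")
)

key_report <- check_key(assay_toy, biospecimen_toy, "specimenID")
print(key_report)
```

On the real data, the corresponding calls would be check_key(assay_metadata, biospecimen_metadata, "specimenID") and check_key(biospecimen_metadata, individual_metadata, "individualID").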

Join Metadata

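Now we join the metadata files with left joins, matching first on specimenID and then on individualID. Because the join starts from the assay metadata, every assay row is kept, and any row without a match receives NA in the joined columns.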

# start from the assay metadata; a left join keeps every assay row
joined_meta <- assay_metadata |>

  # add biospecimen columns for rows with a matching specimenID
  left_join(biospecimen_metadata, by = "specimenID") |>

  # add individual columns for rows with a matching individualID
  left_join(individual_metadata, by = "individualID")

joined_meta
Filename.x
<chr>
staging_mouse_metabolomics/processed/longevity_mouse_gastroc_unified_all.csv
staging_mouse_metabolomics/processed/longevity_mouse_gastroc_unified_knowns.csv
staging_mouse_metabolomics/processed/longevity_mouse_gonadal_unified_all.csv
staging_mouse_metabolomics/processed/longevity_mouse_gonadal_unified_knowns.csv
staging_mouse_metabolomics/processed/longevity_mouse_inguinal_unified_all.csv
staging_mouse_metabolomics/processed/longevity_mouse_inguinal_unified_knowns.csv
staging_mouse_metabolomics/processed/longevity_mouse_kidney_unified_all.csv
staging_mouse_metabolomics/processed/longevity_mouse_kidney_unified_knowns.csv
staging_mouse_metabolomics/processed/longevity_mouse_liver_unified_all.csv
staging_mouse_metabolomics/processed/longevity_mouse_liver_unified_knowns.csv
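
The Filename.x column in the output above suggests that more than one of the metadata files has a Filename column: when non-key columns share a name, dplyr appends .x and .y to tell them apart. The sketch below reproduces this on toy data and uses the suffix argument of left_join() to give the clashing columns more descriptive names. The toy data frames and the suffix strings are assumptions for illustration; the real files may share other column names as well.

```r
library(dplyr)

# Toy tables for illustration only: both carry a non-key `Filename` column
assay_toy <- data.frame(
  specimenID = c("S1", "S2"),
  Filename = c("assay_a.csv", "assay_b.csv")
)
biospecimen_toy <- data.frame(
  specimenID = c("S1", "S2"),
  individualID = c("I1", "I2"),
  Filename = c("biospecimen.csv", "biospecimen.csv")
)
individual_toy <- data.frame(
  individualID = c("I1", "I2"),
  sex = c("female", "male")
)

joined_toy <- assay_toy |>
  # name the clashing columns by their source instead of .x / .y
  left_join(biospecimen_toy, by = "specimenID",
            suffix = c("_assay", "_biospecimen")) |>
  left_join(individual_toy, by = "individualID")

print(names(joined_toy))
```

Applying the same suffix argument to the first join of the real metadata would rename Filename.x accordingly.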

Congratulations! You have now bulk downloaded and joined metadata files!